
Meaningful Pose-Based Sign Language Evaluation

Jiang, Zifan, Leong, Colin, Moryossef, Amit, Göhring, Anne, Rios, Annette, Cory, Oliver, Ivashechkin, Maksym, Tarigopula, Neha, Zhang, Biao, Sennrich, Rico, Ebling, Sarah

arXiv.org Artificial Intelligence

We present a comprehensive study on meaningfully evaluating sign language utterances in the form of human skeletal poses. The study covers keypoint distance-based, embedding-based, and back-translation-based metrics. We show tradeoffs between different metrics in different scenarios through automatic meta-evaluation of sign-level retrieval and a human correlation study of text-to-pose translation across different sign languages. Our findings and the open-source pose-evaluation toolkit provide a practical and reproducible way of developing and evaluating sign language translation or generation systems.
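The keypoint distance-based family of metrics the study covers can be illustrated with a minimal mean per-joint error; this is a generic sketch, not the toolkit's actual implementation, and the array shapes are assumptions:

```python
import numpy as np

def mean_joint_error(pred, ref):
    """Mean Euclidean distance per keypoint between two aligned pose
    sequences of shape (frames, keypoints, dims)."""
    assert pred.shape == ref.shape
    return float(np.linalg.norm(pred - ref, axis=-1).mean())

# Toy example: two 3-frame, 2-keypoint, 2-D pose sequences.
ref = np.zeros((3, 2, 2))
pred = np.ones((3, 2, 2))  # every keypoint is off by (1, 1)
print(mean_joint_error(pred, ref))  # → 1.4142135623730951 (i.e. sqrt(2))
```

Embedding-based and back-translation-based metrics replace the raw keypoint comparison with distances in a learned feature space or with text-level scores of a back-translated utterance, respectively.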


Geometry-Aware Losses for Structure-Preserving Text-to-Sign Language Generation

Wu, Zetian, Zhou, Tianshuo, Lee, Stefan, Huang, Liang

arXiv.org Artificial Intelligence

Sign language translation from text to video plays a crucial role in enabling effective communication for Deaf and hard-of-hearing individuals. A major challenge lies in generating accurate and natural body poses and movements that faithfully convey intended meanings. Prior methods often neglect the anatomical constraints and coordination patterns of human skeletal motion, resulting in rigid or biomechanically implausible outputs. To address this, we propose a novel approach that explicitly models the relationships among skeletal joints (including shoulders, arms, and hands) by incorporating geometric constraints on joint positions, bone lengths, and movement dynamics. During training, we introduce a parent-relative reweighting mechanism to enhance finger flexibility and reduce motion stiffness. Additionally, bone-pose losses and bone-length constraints enforce anatomically consistent structures. Our method narrows the performance gap between the previous best and the ground-truth oracle by 56.51%, and further reduces discrepancies in bone length and movement variance by 18.76% and 5.48%, respectively, demonstrating significant gains in anatomical realism and motion naturalness.
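A bone-length constraint of the kind described can be sketched as below; the skeleton topology (`BONES`) and the exact loss form are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

# Hypothetical 4-keypoint chain; each bone is a (parent, child) index pair.
BONES = [(0, 1), (1, 2), (2, 3)]

def bone_lengths(pose, bones=BONES):
    """Euclidean length of each bone for a pose of shape (keypoints, dims)."""
    return np.array([np.linalg.norm(pose[c] - pose[p]) for p, c in bones])

def bone_length_loss(pred, ref, bones=BONES):
    """Mean absolute deviation of predicted bone lengths from the reference
    skeleton, encouraging anatomically consistent structure."""
    return float(np.abs(bone_lengths(pred, bones) - bone_lengths(ref, bones)).mean())

ref_pose = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])   # unit bones
pred_pose = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0], [6.0, 0.0]])  # stretched 2x
print(bone_length_loss(pred_pose, ref_pose))  # → 1.0 (every bone is 1.0 too long)
```

Because the loss compares lengths rather than positions, it stays small for any pose that preserves the reference skeleton's proportions, which is exactly the anatomical-consistency property the abstract targets.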


Text2Sign Diffusion: A Generative Approach for Gloss-Free Sign Language Production

Feng, Liqian, Wang, Lintao, Hu, Kun, Kong, Dehui, Wang, Zhiyong

arXiv.org Artificial Intelligence

Sign language production (SLP) aims to translate spoken language sentences into a sequence of pose frames in a sign language, bridging the communication gap and promoting digital inclusion for deaf and hard-of-hearing communities. Existing methods typically rely on gloss, a symbolic representation of sign language words or phrases that serves as an intermediate step in SLP. This limits the flexibility and generalization of SLP, as gloss annotations are often unavailable and language-specific. Therefore, we present a novel diffusion-based generative approach, Text2Sign Diffusion (Text2SignDiff), for gloss-free SLP. Specifically, a gloss-free latent diffusion model is proposed to generate sign language sequences from noisy latent sign codes and spoken text jointly, reducing the potential error accumulation through a non-autoregressive iterative denoising process. We also design a cross-modal signing aligner that learns a shared latent space to bridge visual and textual content in sign and spoken languages. This alignment supports the conditioned diffusion-based process, enabling more accurate and contextually relevant sign language generation without gloss. Extensive experiments on the commonly used PHOENIX14T and How2Sign datasets demonstrate the effectiveness of our method, achieving state-of-the-art performance.
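The non-autoregressive iterative denoising described above can be illustrated with a toy loop; `denoise_step` here is a placeholder for the learned conditional denoiser, and all shapes, schedules, and update rules are assumptions made for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise_step(z, t, text_emb):
    """Placeholder for the learned conditional denoiser: a real model would
    predict and remove noise given latent z, timestep t, and text conditioning."""
    return 0.9 * z + 0.1 * text_emb  # toy update pulling z toward the condition

def generate(text_emb, steps=50):
    """Non-autoregressive generation: start from pure noise and iteratively
    refine the whole latent sign-code sequence at once."""
    z = rng.normal(size=text_emb.shape)
    for t in reversed(range(steps)):
        z = denoise_step(z, t, text_emb)
    return z

latents = generate(np.ones(16))  # converges toward the conditioning vector
```

Because every refinement step updates the entire latent sequence jointly, no frame's prediction is conditioned on a previously generated frame, which is how the non-autoregressive process avoids the error accumulation of autoregressive decoding.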


A Transformer-Based Framework for Greek Sign Language Production using Extended Skeletal Motion Representations

Pratikaki, Chrysa, Filntisis, Panagiotis, Katsamanis, Athanasios, Roussos, Anastasios, Maragos, Petros

arXiv.org Artificial Intelligence

To address communication barriers between the DHH (Deaf and Hard-of-Hearing) and the hearing communities, the field of Sign Language Processing has emerged at the intersection of linguistics, computer vision, and machine learning. Sign Language Processing encompasses a variety of tasks aimed at bridging the gap between DHH and hearing communities by enabling the automatic translation and generation of sign language. The most critical components of an effective sign language system are Sign Language Translation (SLT) and Sign Language Production (SLP). In this paper, we primarily focus on Sign Language Production (SLP). Building on insights from previous research, we propose a deep learning model for Sign Language Production (SLP), which to our knowledge is the first attempt at Greek SLP. We tackle this task by utilizing a transformer-based architecture that enables the translation from text input to human pose keypoints, and the opposite. We evaluate the effectiveness of the proposed pipeline on the Greek SL dataset Elementary23, through a series of comparative analyses and ablation studies. Our pipeline's components, which include data-driven gloss generation, training through video to text translation and a


Beyond Words: AuralLLM and SignMST-C for Precise Sign Language Production and Bidirectional Accessibility

Li, Yulong, Zhang, Yuxuan, Tang, Feilong, Zhou, Mian, Lu, Zhixiang, Xue, Haochen, Wang, Yifang, Dang, Kang, Su, Jionglong

arXiv.org Artificial Intelligence

Although sign language recognition helps hearing individuals understand signers, many hearing-impaired individuals still rely on sign language alone due to limited literacy, underscoring the need for advanced sign language production and translation (SLP and SLT) systems. In the field of sign language production, the lack of adequate models and datasets restricts practical applications. Existing models face challenges in production accuracy and pose control, making it difficult to provide fluent sign language expressions across diverse scenarios. Additionally, data resources are scarce, particularly high-quality datasets with complete sign vocabulary and pose annotations. To address these issues, we introduce CNText2Sign and CNSign, comprehensive datasets to benchmark SLP and SLT, respectively, with CNText2Sign covering gloss and landmark mappings for SLP, and CNSign providing extensive video-to-text data for SLT. To improve the accuracy and applicability of sign language systems, we propose the AuraLLM and SignMST-C models. AuraLLM, incorporating LoRA and RAG techniques, achieves a BLEU-4 score of 50.41 on the CNText2Sign dataset, enabling precise control over gesture semantics and motion. SignMST-C employs self-supervised rapid motion video pretraining, achieving a BLEU-4 score of 31.03/32.08 on the PHOENIX2014-T benchmark, setting a new state-of-the-art. These models establish robust baselines for the datasets released for their respective tasks.


Learning to Write Rationally: How Information Is Distributed in Non-Native Speakers' Essays

Tang, Zixin, van Hell, Janet G.

arXiv.org Artificial Intelligence

People tend to distribute information evenly in language production for better and clearer communication. In this study, we compared essays written by second language learners with various native language (L1) backgrounds to investigate how they distribute information in their non-native language (L2) production. Analyses of surprisal and constancy of entropy rate indicated that writers with higher L2 proficiency can reduce the expected uncertainty of language production while still conveying informative content. However, the uniformity of information distribution showed less variability among different groups of L2 speakers, suggesting that this feature may be universal in L2 essay writing and less affected by L2 writers' variability in L1 background and L2 proficiency.
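The surprisal and information-uniformity analyses mentioned above can be sketched with a simple unigram model standing in for the language model such studies typically use; the function names and the variance-based uniformity measure are illustrative assumptions:

```python
import math
from collections import Counter

def surprisals(tokens, counts, total):
    """Per-token surprisal -log2 p(w), here under a simple unigram model."""
    return [-math.log2(counts[t] / total) for t in tokens]

def surprisal_variance(surp):
    """Variance of surprisal across the text: lower variance means
    information is distributed more evenly (more uniform density)."""
    m = sum(surp) / len(surp)
    return sum((s - m) ** 2 for s in surp) / len(surp)

text = "the cat sat on the mat".split()
counts = Counter(text)
surp = surprisals(text, counts, len(text))
print(round(surprisal_variance(surp), 4))  # → 0.2222
```

Lower mean surprisal corresponds to the reduced "expected uncertainty" the abstract attributes to proficient L2 writers, while low surprisal variance corresponds to the uniformity of information distribution it finds to be stable across L1 backgrounds.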


An Open-Source American Sign Language Fingerspell Recognition and Semantic Pose Retrieval Interface

Thomas, Kevin Jose

arXiv.org Artificial Intelligence

This paper introduces an open-source interface for American Sign Language fingerspell recognition and semantic pose retrieval, aimed to serve as a stepping stone towards more advanced sign language translation systems. Utilizing a combination of convolutional neural networks and pose estimation models, the interface provides two modular components: a recognition module for translating ASL fingerspelling into spoken English and a production module for converting spoken English into ASL pose sequences. The system is designed to be highly accessible, user-friendly, and capable of functioning in real-time under varying environmental conditions like backgrounds, lighting, skin tones, and hand sizes. We discuss the technical details of the model architecture, application in the wild, as well as potential future enhancements for real-world consumer applications.


Universal Gloss-level Representation for Gloss-free Sign Language Translation and Production

Hwang, Eui Jun, Cho, Sukmin, Lee, Huije, Yoon, Youngwoo, Park, Jong C.

arXiv.org Artificial Intelligence

Sign language, essential for the deaf and hard-of-hearing, presents unique challenges in translation and production due to its multimodal nature and the inherent ambiguity in mapping sign language motion to spoken language words. Previous methods often rely on gloss annotations, requiring time-intensive labor and specialized expertise in sign language. Gloss-free methods have emerged to address these limitations, but they often depend on external sign language data or dictionaries, failing to completely eliminate the need for gloss annotations. There is a clear demand for a comprehensive approach that can supplant gloss annotations and be utilized for both Sign Language Translation (SLT) and Sign Language Production (SLP). We introduce Universal Gloss-level Representation (UniGloR), a unified and self-supervised solution for both SLT and SLP, trained on multiple datasets including PHOENIX14T, How2Sign, and NIASL2021. Our results demonstrate UniGloR's effectiveness in the translation and production tasks. We further report an encouraging result for Sign Language Recognition (SLR) on previously unseen data. Our study suggests that self-supervised learning can be applied in a unified manner, paving the way for innovative and practical applications in future research.


T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text

Yin, Aoxiong, Li, Haoyuan, Shen, Kai, Tang, Siliang, Zhuang, Yueting

arXiv.org Artificial Intelligence

In this work, we propose a two-stage sign language production (SLP) paradigm that first encodes sign language sequences into discrete codes and then autoregressively generates sign language from text based on the learned codebook. However, existing vector quantization (VQ) methods are fixed-length encodings, overlooking the uneven information density in sign language, which leads to under-encoding of important regions and over-encoding of unimportant regions. To address this issue, we propose a novel dynamic vector quantization (DVA-VAE) model that can dynamically adjust the encoding length based on the information density in sign language to achieve accurate and compact encoding. Then, a GPT-like model learns to generate code sequences and their corresponding durations from spoken language text. Extensive experiments conducted on the PHOENIX14T dataset demonstrate the effectiveness of our proposed method. To promote sign language research, we propose a new large German sign language dataset, PHOENIX-News, which contains 486 hours of sign language videos, audio, and transcription texts. Experimental analysis on PHOENIX-News shows that the performance of our model can be further improved by increasing the size of the training data. Our project homepage is https://t2sgpt-demo.yinaoxiong.cn.
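The fixed-length VQ baseline that the dynamic DVA-VAE improves on amounts to a nearest-codebook lookup per frame; the sketch below is illustrative, and the feature dimensions are assumptions:

```python
import numpy as np

def vq_encode(frames, codebook):
    """Fixed-length VQ: map every frame to its nearest codebook entry.
    frames: (T, d) pose features; codebook: (K, d) learned code vectors."""
    # dists[t, k] = squared distance between frame t and code k
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)  # one code index per frame, regardless of content

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
frames = np.array([[0.1, 0.0], [0.9, 1.1], [0.0, 0.2]])
print(vq_encode(frames, codebook))  # → [0 1 0]
```

Because this assigns exactly one code per frame, information-dense stretches of signing get the same code budget as static ones; the dynamic variant instead lets the encoding length vary with information density.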


What is a word?

Murphy, Elliot

arXiv.org Artificial Intelligence

"Despite 2,400 years or so of trying, it is unclear that anyone has ever come up with an adequate definition of any word whatsoever, even the simplest." Surprisingly few linguists and philosophers have a clear model of what a word is, even though words impact basically every aspect of human life. Researchers who regularly publish academic papers about language often rely on outdated, or inaccurate, assumptions about wordhood. As in all scientific disciplines, we have two notions to consider: 1. our intuitive concept of 'word' (which we all have, even though it can be vague, and sometimes hard to articulate fully, like most complex concepts). This is no different from other scientific concepts: for example, 'water' has a very intuitive meaning, but it is also linked to much more technical, formal notions emerging from chemistry and physics (Murphy 2023). This short pedagogical document outlines what the lexicon is most certainly not (though it is often mistakenly taken to be), what it might be (based on current good theories), and what some implications for experimental design are. The central features of lexical items have no connection with sensorimotor instructions.